Margin Distribution Controlled Boosting
Authors
Abstract
Schapire’s margin theory provides a theoretical explanation of the success of boosting-type methods and shows that a good margin distribution (MD) over the training samples is essential for generalization. However, the statement that an MD is good is vague; consequently, many recently developed algorithms try to generate an MD that is good in their own sense in order to improve generalization. Unlike their indirect control over the MD, in this paper we propose an alternative boosting algorithm, termed Margin distribution Controlled Boosting (MCBoost), which directly controls the MD by introducing and optimizing a key adjustable margin parameter. MCBoost’s optimization is implemented with the column generation technique, which ensures fast convergence and a small number of weak classifiers in the final MCBooster. We empirically demonstrate that: 1) AdaBoost is in fact also an MD-controlled algorithm, whose iteration number acts as a parameter controlling the distribution; and 2) the generalization performance of MCBoost, evaluated on UCI benchmark datasets, is better than that of AdaBoost, L2Boost, LPBoost, AdaBoost-CG and MDBoost.

Index Terms – Boosting, Margin Distribution, Margin Control, Generalization.
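The abstract does not spell out MCBoost’s master problem, so the sketch below only illustrates the column-generation pattern it refers to, using an LPBoost-style soft-margin LP as a stand-in master (solved with scipy’s HiGHS backend). The names `cg_boost` and `best_stump` and the trade-off parameter `D` (playing the role of an adjustable margin parameter) are assumptions, not the paper’s notation.

```python
# A minimal column-generation boosting sketch with an LPBoost-style
# soft-margin master; MCBoost's actual objective is not given in the
# abstract, so this stand-in only shows the loop structure.
import numpy as np
from scipy.optimize import linprog

def best_stump(X, y, w):
    """Pricing step: return the decision stump with the largest weighted edge."""
    best_edge, best = -np.inf, None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1.0, -1.0):
                pred = s * np.where(X[:, j] <= t, 1.0, -1.0)
                edge = float(np.sum(w * y * pred))
                if edge > best_edge:
                    best_edge, best = edge, (j, t, s)
    return best_edge, best

def stump_predict(X, stump):
    j, t, s = stump
    return s * np.where(X[:, j] <= t, 1.0, -1.0)

def cg_boost(X, y, D=0.2, max_cols=50, tol=1e-6):
    n = len(y)
    w = np.full(n, 1.0 / n)            # dual sample weights
    gamma = -np.inf                    # current dual objective value
    cols, stumps, alpha = [], [], None
    for _ in range(max_cols):
        edge, stump = best_stump(X, y, w)
        if edge <= gamma + tol:        # no column can improve the master
            break
        stumps.append(stump)
        cols.append(y * stump_predict(X, stump))   # column of y_i * h(x_i)
        Hm, J = np.array(cols).T, len(cols)
        # Restricted master (soft-margin LP): max rho - D*sum(xi)
        #   s.t. Hm @ alpha >= rho - xi, sum(alpha) = 1, alpha, xi >= 0.
        # Variable order: [alpha (J), rho (1), xi (n)].
        c = np.r_[np.zeros(J), -1.0, D * np.ones(n)]
        A_ub = np.hstack([-Hm, np.ones((n, 1)), -np.eye(n)])
        A_eq = np.r_[np.ones(J), 0.0, np.zeros(n)].reshape(1, -1)
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * J + [(None, None)] + [(0, None)] * n,
                      method="highs")
        alpha, gamma = res.x[:J], -res.fun
        # Duals of the margin constraints become the next sample weights
        # (abs + renormalize absorbs scipy's sign convention for these duals).
        w = np.abs(res.ineqlin.marginals)
        w /= w.sum()
    return stumps, alpha
```

Each round prices a new weak learner against the current dual weights, re-solves the restricted master over all columns found so far, and stops as soon as no learner’s edge exceeds the dual objective; this early optimality certificate is what keeps the number of weak classifiers in the final ensemble small.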
Similar resources
Large Margin Distribution Learning
Support vector machines (SVMs) and Boosting are possibly the two most popular learning approaches during the past two decades. It is well known that the margin is a fundamental issue of SVMs, whereas recently the margin theory for Boosting has been defended, establishing a connection between these two mainstream approaches. The recent theoretical results disclosed that the margin distribution r...
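As a point of reference for this and the snippets below, the margin of a training example under a voting classifier is the standard textbook quantity; the definition here is not taken from the truncated abstract:

```latex
% Margin of example (x_i, y_i) under the voting classifier
% f(x) = \sum_t \alpha_t h_t(x) with \alpha_t \ge 0:
\[
  m_i \;=\; \frac{y_i \sum_t \alpha_t h_t(x_i)}{\sum_t \alpha_t} \;\in\; [-1, 1].
\]
% The margin distribution is the empirical distribution of the m_i over the
% training set; m_i > 0 exactly when example i is classified correctly, and
% larger values mean a more confident vote.
```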
Supervised projection approach for boosting classifiers
In this paper we present a new boosting approach for the construction of ensembles of classifiers. The approach is based on using the distribution given by the weighting scheme of boosting to construct a non-linear supervised projection of the original variables, instead of using the weights of the instances to train the next classifier. With this method we construct ensembles that ...
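The snippet is truncated, so the mechanics below are only a plausible reading of the idea: the boosting weights are used to fit a supervised projection that emphasizes currently hard examples, rather than being passed directly to the next base learner. A weighted Fisher (LDA) direction is used here purely as a stand-in for the paper’s projection.

```python
# Sketch: turn boosting sample weights w into one supervised projection
# direction via a weighted Fisher criterion (y assumed in {-1, +1}).
import numpy as np

def weighted_fisher_direction(X, y, w):
    """One projection direction from weighted class means and scatter."""
    d = X.shape[1]
    mu, Sw = {}, np.zeros((d, d))
    for c in (-1, 1):
        m = y == c
        wc = w[m] / w[m].sum()
        mu[c] = wc @ X[m]                     # weighted class mean
        Z = X[m] - mu[c]
        Sw += (Z * wc[:, None]).T @ Z         # weighted within-class scatter
    v = np.linalg.solve(Sw + 1e-6 * np.eye(d), mu[1] - mu[-1])
    return v / np.linalg.norm(v)
```

In a boosting loop, each round would refit this direction under the updated weights, train the base classifier on the projected data `X @ v`, and update the weights from its errors as usual.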
A Duality View of Boosting Algorithms
We study boosting algorithms from a new perspective. We show that the Lagrange dual problems of AdaBoost, LogitBoost and soft-margin LPBoost with generalized hinge loss are all entropy maximization problems. By looking at the dual problems of these boosting algorithms, we show that the success of boosting algorithms can be understood in terms of maintaining a better margin distribution by maxim...
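To make the entropy-maximization claim concrete, this is the general shape such duals take for l1-regularized exponential-loss boosting; the notation is schematic and mine, with epsilon a regularization constant, and the paper’s exact formulation may differ:

```latex
% Primal (l1-regularized exponential-loss boosting):
%   min_{\alpha \ge 0}  \log \sum_i \exp(-y_i \sum_j \alpha_j h_j(x_i))
%                       + \epsilon \sum_j \alpha_j
% Its Lagrange dual is a maximum-entropy problem over sample weights w:
\[
  \max_{w} \; -\sum_i w_i \log w_i
  \quad\text{s.t.}\quad
  \sum_i w_i\, y_i\, h_j(x_i) \le \epsilon \;\;\forall j,
  \qquad \sum_i w_i = 1, \;\; w_i \ge 0.
\]
```

Reading the dual this way, the optimal sample weights stay as close to uniform as the edge constraints allow, which is the "maintaining a better margin distribution" interpretation the snippet refers to.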
Empirical Margin Distributions and Bounding the Generalization Error of Combined Classifiers
We prove new probabilistic upper bounds on generalization error of complex classifiers that are combinations of simple classifiers. Such combinations could be implemented by neural networks or by voting methods of combining the classifiers, such as boosting and bagging. The bounds are in terms of the empirical distribution of the margin of the combined classifier. They are based on the methods ...
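Bounds in this family typically take the following shape; the version below is a schematic of the earlier Schapire et al. (1998) bound, which this line of work refines, so constants and log factors are indicative only. Here d is the VC dimension of the base class, and with probability at least 1 - delta, for every theta > 0:

```latex
\[
  \Pr\!\big[\, y f(x) \le 0 \,\big]
  \;\le\;
  \widehat{\Pr}_n\!\big[\, y f(x) \le \theta \,\big]
  \;+\;
  O\!\left( \sqrt{ \frac{d \log^2(n/d)}{n\,\theta^2}
                   + \frac{\log(1/\delta)}{n} } \right).
\]
```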
Boosting the margin: A new explanation for the effectiveness of voting methods
One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated classifier usually does not increase as its size becomes very large, and often is observed to decrease even after the training error reaches zero. In this paper, we show that this phenomenon is related to the distribution of margins of the training examples with respect to the...
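A quick way to inspect the phenomenon the snippet describes is to compute the empirical margins of a voting classifier and compare their cumulative distribution across rounds. The helper below is a generic utility, not code from the paper; `preds` is assumed to be a (T, n) array of +/-1 base-classifier outputs and `alphas` their T vote weights.

```python
# Empirical margin distribution of a voting classifier.
import numpy as np

def margins(preds, alphas, y):
    """Normalized margins y_i * f(x_i) / ||alpha||_1, each in [-1, 1]."""
    a = np.asarray(alphas, dtype=float)
    return y * (a @ preds) / np.abs(a).sum()

def margin_cdf(m, thetas):
    """Fraction of training examples with margin <= theta, per theta."""
    m = np.sort(m)
    return np.searchsorted(m, thetas, side="right") / len(m)

# e.g. compare margin_cdf(margins(preds[:10], alphas[:10], y), thetas)
# with the CDF after 1000 rounds: the curve typically shifts right even
# when both ensembles already have zero training error.
```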
Journal: CoRR
Volume: abs/1208.1846
Issue: -
Pages: -
Publication date: 2012